morpion solitaire
Tackling Morpion Solitaire with AlphaZero-likeRanked Reward Reinforcement Learning
Wang, Hui, Preuss, Mike, Emmerich, Michael, Plaat, Aske
Morpion Solitaire is a popular single player game, performed with paper and pencil. Due to its large state space (on the order of the game of Go) traditional search algorithms, such as MCTS, have not been able to find good solutions. A later algorithm, Nested Rollout Policy Adaptation, was able to find a new record of 82 steps, albeit with large computational resources. After achieving this record, to the best of our knowledge, there has been no further progress reported, for about a decade. In this paper we take the recent impressive performance of deep self-learning reinforcement learning approaches from AlphaGo/AlphaZero as inspiration to design a searcher for Morpion Solitaire. A challenge of Morpion Solitaire is that the state space is sparse, there are few win/loss signals. Instead, we use an approach known as ranked reward to create a reinforcement learning self-play framework for Morpion Solitaire. This enables us to find medium-quality solutions with reasonable computational effort. Our record is a 67 steps solution, which is very close to the human best (68) without any other adaptation to the problem than using ranked reward. We list many further avenues for potential improvement.
- Europe > Netherlands > South Holland > Leiden (0.05)
- Asia > China (0.05)
- North America > United States > New York (0.04)
Nested Monte-Carlo Search
Cazenave, Tristan (Université Paris-Dauphine)
Many problems have a huge state space and no good heuristic to order moves so as to guide the search toward the best positions. Random games can be used to score positions and evaluate their interest. Random games can also be improved using random games to choose a move to try at each step of a game. Nested Monte-Carlo Search addresses the problem of guiding the search toward better states when there is no available heuristic. It uses nested levels of random games in order to guide the search. The algorithm is studied theoretically on simple abstract problems and applied successfully to three different games: Morpion Solitaire, SameGame and 16x16 Sudoku.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)